1. Introduction


Breast  cancer  is  a  heterogeneous  disease  in  terms  of  presentation,  morphology,  molecular  profile 
and  response  to  therapy.  Gene  expression  profiling  has  identified  six  molecular  subtypes,  i.e.  luminal  A, 
luminal  B,  normal  breast-like,  HER2+,  basal-like  and  claudin-low,  that  are  associated  with  clinical  markers 
as  well  as  prognosis  and  survival  [1-4].  However,  it  is  well  established  that  the  intrinsic  molecular  profiles 
of  breast  tumors  are  not  sufficient  to  perfectly  predict  disease  outcome.  Increasing  evidence  indicates 
that  characteristics  of  the  breast  stroma  and,  perhaps  more  specifically,  interactions  between  the  tumor 
epithelium  and  stroma  influence  breast  cancer  progression  and  response  to  therapy.  Previous  work  in 
our  lab  has  demonstrated  that  gene  expression  signatures  in  human  stroma  can  predict  outcome  of  breast 
cancer  patients  independently  of  clinical  parameters  and  molecular  subtypes  [5].  To  expand  on  these 
results,  the  goal  of  this  project  is  to  identify,  define  and  formally  test  critical  pathways  mediating  tumor 
epithelial-stromal  communication  and  co-dependency  in  Triple-Negative  (TN)  breast  cancer  (defined  as 
tumors  lacking  expression  of  estrogen  receptor,  progesterone  receptor  and  human  epidermal  growth 
factor  receptor-2  (HER2)),  a  subtype  associated  with  poor  outcome.  We  hypothesize  that  by  defining  the 
tumor stromal pathways associated with poor outcome in TN tumors, we will uncover mechanisms for
co-evolution, biomarkers and potential therapeutic targets. Our specific aims are to develop coordinate
stromal-epithelial expression signatures for a cohort of ca. 50 TN breast cancers for which outcome and
follow-up  are  available,  to  identify  stromal-epithelial  gene  interaction  networks,  and  to  identify  and 
integrate  stromal-epithelial  gene  expression  and  microRNA  (miR)  signatures  associated  with  TN  breast 
tumors.  It  is  well  recognized  that  heterogeneity  is  a  key  factor  underlying  the  variability  in  patient 
response  to  treatment,  especially  in  TN  cases.  There  is  a  need  for  a  fuller  understanding  of  the  molecularly 
distinct  TN  subgroups  linked  to  outcome  and  the  development  of  more  personalized  treatment  strategies 
for  members  of  this  subgroup.  This  project  will  provide  the  first  integrated  in-depth  analysis  of  the 
contribution  of  tumor  stromal  processes  to  disease  heterogeneity,  and  will  position  the  tumor 
microenvironment  for  therapeutic  intervention.  This  project  also  promises  the  "next  generation"  of 
signatures  based  on  miR  that  are  stable  in  clinical  materials  and  can  be  developed  for  non-invasive  tests 
suitable  for  stratification  of  patients  for  chemotherapy,  monitoring  disease  progression  and,  in  the  long 
term,  for  early  detection  and  screening  for  metastatic  disease. 




2.  Keywords 


Breast  cancer,  Triple-Negative,  epithelium,  stroma,  gene  expression,  microRNA,  laser  capture 
microdissection,  heterogeneity,  molecular  profiles,  tumor  microenvironment 

3.  Overall  Project  Summary 

3.1  Current  Objectives 

This  research  project  has  3  tasks  covering  3  years  (refer  to  Statement  of  Work  in  Appendix  1): 

1.  Develop  coordinate  stromal-epithelial  mRNA  expression  signatures  for  TN  tumors. 

2.  Identify  stromal-epithelial  gene  interaction  networks. 

3. Identify and integrate stromal-epithelial miR signatures associated with TN breast tumors.

The objectives for this project in its second year (2014-2015) were as follows:

•  Identify  stromal  subclasses  of  TN  tumors  based  on  gene  expression. 

•  Develop  a  de  novo  bioinformatics  tool  to  identify  genes  modulating  cross-talk  between 
tumor  epithelium  and  tumor-associated  stromal  compartments. 

•  Investigate  miR  signatures  for  their  prognostic  value  by  using  linked  patient  outcome 
data. 

3.2  Results,  Progress  and  Accomplishments 

During  the  first  year  of  this  project,  we  successfully  isolated  epithelial  and  stromal  tissue  from 
banked  TN  tumor  samples  via  laser  capture  microdissection  (LCM)  (see  Figure  1  in  Appendix  2  for  a 
depiction  of  LCM;  methods  as  per  [6]).  RNA  was  extracted  from  epithelial  and  stromal  LCM  isolates  and 
subjected  to  microarray-based  gene  expression  profiling  via  Agilent  SurePrint  G3  8x60K  chips  using 
methods  based  on  those  previously  described  by  our  group  [5,  6].  In  the  second  year  of  this  project,  our 
first  goal  was  to  use  the  gene  expression  data  to  identify  stromal  subclasses  of  TN  tumors.  We  then 
wanted  to  identify  the  genes  modulating  cross-talk  between  the  tumor  epithelium  and  associated  stromal 
compartments.  Before  beginning  our  analyses,  we  verified  the  integrity  (tissue  specificity)  of  the 
normalized gene expression data. We selected the most variable genes (interquartile range (IQR) > 2)
across all samples and partitioned this gene set, without supervision, into two groups of genes with
opposing directions of expression using the Partitioning Around Medoids (pam) function from the
cluster package in R [Maechler, 2015; version 2.0.1;
http://wis.kuleuven.be/stat/robust/papers/2005/maechleretal-rpackagecluster-cran-2005.pdf]. We
then ordered the patient samples from lowest to highest expression of the overall gene set (Figure 2,
Appendix 2). Each sample is first ranked on the expression of each characteristic gene, and the
ranks are then summed per sample. Thus, samples with the smallest rank sum
are ranked lowest (right) and those with the largest rank sum are ranked highest (left). This
approach  revealed  that  the  epithelial  and  stromal  tissue  samples  are  distinct,  and  that  adjacent  normal 
tissue  can  be  distinguished  from  tumor  tissue  (Figure  2,  Appendix  2).  Therefore,  the  LCM  procedure  was 
successful  in  separating  epithelial  from  stromal  tissue,  as  well  as  tumor  from  adjacent  normal  tissue. 

After  confirming  the  integrity  of  the  gene  expression  data,  we  attempted  to  identify  stromal 
subclasses using a clustering-based class discovery approach [1]. We defined subtypes as groups of
patients with similar gene expression profiles that cluster closely together. However, due to the
complexity  of  the  gene  expression  profiles,  this  method  of  subtyping  masked  certain  clusters  of  genes,  i.e. 
patient  clusters  were  predominantly  driven  by  immune-related  genes  and  this  strong  immune  signal 
masked  the  effect  of  weaker  gene  clusters.  Therefore,  we  adopted  an  alternate  approach  that  classifies 
groups  of  genes  (gene  modules)  by  the  degree  to  which  they  co-vary  across  patient  samples.  This  method, 
referred  to  as  Weighted  Gene  Correlation  Network  Analysis  (WGCNA)  [7],  identified  co-modulated  groups 
of  genes  that  had  high  absolute  correlation  in  patient  stromal  and  epithelial  tissues.  The  gene  modules 
were  given  color  names  to  prevent  overt  assumptions  about  the  biology  of  the  module,  and  they  were 
associated  with  disease  recurrence  in  the  patients  (Figure  3,  Appendix  2).  In  total,  there  were  25  epithelial 
modules  and  24  stroma  modules  identified  (Figure  4,  Appendix  2). 
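
As a rough illustration of the idea behind WGCNA, the sketch below builds an unsigned, soft-thresholded adjacency (the absolute Pearson correlation raised to a power) and then groups genes whose adjacency exceeds a cutoff. The grouping step is a deliberate simplification: WGCNA itself clusters genes hierarchically on a topological overlap measure. The exponent `beta`, the `cutoff` and all function names are hypothetical choices, not the settings used in this project.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency: |cor(g, h)| ** beta for each gene pair."""
    genes = sorted(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
    }


def simple_modules(expr, beta=6, cutoff=0.5):
    """Group genes linked by adjacency > cutoff (connected components).

    Simplified stand-in for WGCNA's clustering on topological overlap.
    """
    neighbours = {g: set() for g in expr}
    for (g, h), weight in adjacency(expr, beta).items():
        if weight > cutoff:
            neighbours[g].add(h)
            neighbours[h].add(g)
    modules, seen = [], set()
    for start in sorted(expr):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            gene = stack.pop()
            if gene in members:
                continue
            members.add(gene)
            stack.extend(neighbours[gene] - members)
        seen |= members
        modules.append(sorted(members))
    return modules
```

Because the adjacency uses the absolute correlation, genes that are strongly anti-correlated fall into the same module, in line with the "high absolute correlation" criterion described above.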

In  addition  to  intra-tissue  correlation  (Figure  4,  Appendix  2),  the  epithelial  and  stromal  gene 
modules showed inter-tissue correlation as determined by Spearman correlation with permutation testing and
by the hypergeometric (Fisher's exact) test (Figure 5, Appendix 2). Using both intra- and inter-tissue correlations
(Tables  1-3,  Appendix  3),  we  generated  a  correlation  network  of  gene  modules  (Figure  6,  Appendix  2). 
This  correlation  map  revealed  two  main  networks  with  five  network  hubs,  and  an  additional  four 
independent  epithelial-stromal  interactions.  Because  shared  genes  between  the  epithelial  and  stromal 
modules  could  result  in  a  high  correlation  score,  we  calculated  the  percentage  of  common  genes  between 
the  modules  (Table  1,  Appendix  3).  We  defined  distinct  but  correlative  modules  as  having  <20%  common 
genes.  With  the  identification  of  the  epithelial  and  stromal  modules,  as  well  as  the  evaluation  of  how  they 
relate  to  one  another,  we  are  ready  to  characterize  these  modules  according  to  biological  functions  and 
disease  outcome. 
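
The two overlap checks mentioned above, the percentage of shared genes and a hypergeometric test of the overlap, can be sketched as follows. This is a sketch under stated assumptions: the report does not say which denominator was used for the percentage, so the smaller module is assumed here, and the function names, the `universe` size and the example threshold argument are hypothetical.

```python
from math import comb


def percent_shared(module_a, module_b):
    """Shared genes as a percentage of the smaller module (assumed denominator)."""
    a, b = set(module_a), set(module_b)
    return 100.0 * len(a & b) / min(len(a), len(b))


def is_distinct(module_a, module_b, max_shared_percent=20.0):
    """Apply the <20% shared-gene criterion for 'distinct but correlative' modules."""
    return percent_shared(module_a, module_b) < max_shared_percent


def hypergeom_upper_tail(overlap, size_a, size_b, universe):
    """P(X >= overlap) for the overlap of two gene sets drawn from `universe` genes.

    This is the one-sided hypergeometric (Fisher's exact) p-value for enrichment.
    """
    upper = min(size_a, size_b)
    favourable = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(universe, size_b)
```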

The third goal for the second year of this project was to investigate miR signatures for their
prognostic  value  using  linked  patient  outcome  data.  During  the  first  year  of  this  project,  we  were  delayed 
in  profiling  the  miR  expression  in  tumor  and  normal  epithelium  and  stroma  as  a  result  of  technical 
difficulties.  While  the  profiling  is  now  completed,  data  normalization  is  still  ongoing.  Therefore,  we  have 
not  yet  assessed  the  miR  signatures  for  their  prognostic  value.  We  anticipate  that  the  normalized  miR 
data  will  be  available  by  December  2015  in  order  to  investigate  and  validate  miR  of  interest  as  outlined  in 
the  Statement  of  Work  (Appendix  1). 

3.3  Discussion 

We  have  previously  demonstrated  that  gene  expression  signatures  in  human  stroma  can  predict 
the  outcome  of  breast  cancer  patients  independently  of  clinical  parameters  and  molecular  subtypes  [5]. 
Moreover,  these  stromal  subclasses  have  been  shown  to  segregate  human  breast  tumors  by  disease 
outcome  and  contribute  significantly  to  tumor  heterogeneity.  Thus,  it  is  clear  that  further  investigation 
into  epithelial-stromal  interactions  is  imperative  to  our  understanding  of  breast  tumor  heterogeneity  and, 
as  such,  has  significant  implications  in  positively  influencing  patient  stratification,  treatment  and  survival. 
This is especially true for TN cases, which represent approx. 15% of all breast cancers [8-11] and, because
they lack targetable clinical markers, are generally treated by combined surgery, radiotherapy and
non-targeted chemotherapy. Many TN tumors display a good response to anthracycline- and taxane-based
chemotherapy,  especially  in  the  neo-adjuvant  setting  [10,  12,  13].  However,  overall  outcome  remains 
poor  in  TN  disease  [10]  and  no  mechanisms  exist  to  determine  which  patients  will  respond  to 
chemotherapy.  We  predict  that  interrogating  the  gene  expression  of  the  epithelium  and  surrounding 
stroma  in  TN  tumors  will  provide  insight  into  the  co-evolution  and/or  co-dependency  of  these  tissues,  and 
reveal  which  gene  signatures  are  associated  with  poor  outcome  as  well  as  foster  the  development  of  more 
personalized  treatment  strategies  for  patients  with  TN  breast  cancer. 

In  the  first  year  of  this  project,  we  successfully  profiled  the  gene  expression  of  tumor  epithelium 
and  associated  stroma  as  well  as  matched  normal  epithelium  and  stroma  of  ca.  50  TN  tumors.  During  the 
second  year  of  this  project,  we  initially  proposed  to  apply  a  "class  discovery"  bioinformatics  approach  to 
the  gene  lists  to  identify  stromal  subgroups.  However,  as  a  result  of  the  complexity  of  the  gene  expression 
profiles,  this  method  of  subtyping  masked  certain  clusters  of  genes.  Therefore,  we  adopted  an  alternate 
approach  (WGCNA)  that  classifies  groups  of  genes  (gene  modules)  by  the  degree  to  which  they  vary 
concurrently  across  patient  samples.  The  identification  of  epithelial  and  stromal  gene  modules  combined 
with  an  evaluation  of  their  correlation  allowed  us  to  generate  a  module  interaction  map  (Figure  6, 
Appendix 2). With this interaction network and an understanding of which modules are "distinct" (i.e.
have  a  low  percentage  of  shared  genes),  we  are  beginning  to  define  the  biological  functions  of  our 
network  hubs  and  related  modules.  To  do  this,  we  are  performing  gene  set  enrichment  analysis  using 
tools  such  as  QIAGEN's  Ingenuity®  Pathway  Analysis  (IPA®,  QIAGEN  Redwood  City, 
www.qiagen.com/ingenuity)  and  the  Molecular  Signatures  Database  (MSigDB)  [14].  As  an  example, 
preliminary  IPA®  analysis  of  one  of  our  network  hubs,  Epithelial  (Epi)  Turquoise,  indicates  that  one  of  the 
canonical  pathways  represented  by  this  gene  list  is  Oxidative  Phosphorylation,  also  referred  to  as 
mitochondrial  respiration  (Figure  7,  Appendix  2).  The  fact  that  patients  with  high  expression  of  the  Epi 
Turquoise  genes  have  lower  reported  recurrence  (Figure  3,  Appendix  2)  suggests  that  TN  tumors  with  high 
levels  of  oxidative  phosphorylation  (versus  glycolysis)  are  more  sensitive  to  therapy  resulting  in  an 
improved  outcome.  This  is  consistent  with  reports  that  glycolytic  cancer  cells  are  resistant  to 
chemotherapeutic  agents  and  that  uptake  of  functional  mitochondria  by  cancer  cells  increases  drug 
sensitivity  [15].  In  addition,  mitochondrial  respiration  is  considered  an  important  source  of  oxidative 
stress  in  cancer  cells  [16]  and  it  has  been  shown  that  sensitivity  to  chemotherapy  is  increased  in  cancer 
cells  with  elevated  oxidative  stress  [17].  These  preliminary  results  are  very  encouraging  and  we  anticipate 
that  future  analyses  will  contribute  to  our  understanding  of  breast  cancer  heterogeneity  by  clarifying 
epithelial-stromal  gene  signatures  and  interactions  in  TN  breast  cancer. 

Changes  in  miR  expression  have  been  documented  in  breast  cancer  [18-22],  and  several  of  these 
have  been  shown  to  be  associated  with  clinical  features  [18,  23-28]  including  response  to  therapy  [29-32]. 
However,  little  is  known  regarding  the  prognostic  value  of  miR  sets  in  tumor  stroma,  particularly  in  TN 
breast  cancer.  The  third  objective  for  this  project  in  its  second  year  was  to  investigate  miR  signatures  for 
their  prognostic  value  using  linked  patient  outcome  data.  Because  of  a  delay  caused  by  technical 
difficulties,  this  objective  has  not  been  met.  However,  we  anticipate  that  the  normalized  miR  data  will  be 
available  by  December  2015  and  this  will  allow  us  to  investigate  and  validate  miR  of  interest  as  proposed. 

4.  Key  Research  Accomplishments 

The  following  key  research  accomplishments  have  contributed  to  our  major  objectives  of 
developing  coordinate  stromal-epithelial  mRNA  expression  signatures  for  TN  tumors  and  identifying 
stromal-epithelial  gene  interaction  networks: 

• Identification of the epithelial and stromal gene modules in TN patient samples

•  Generation  of  the  gene  module  correlation  network 

5.  Conclusion 

Heterogeneity  plays  a  substantial  role  in  the  variability  of  patient  response  to  treatment, 
especially  in  TN  cases.  A  fuller  understanding  of  the  molecularly  distinct  TN  subgroups  linked  to  outcome 
is  essential  to  promote  the  development  of  more  personalized  treatment  strategies.  We  have  previously 
demonstrated  that  gene  expression  signatures  in  human  stroma  can  predict  outcome  of  breast  cancer 
patients  independently  of  clinical  parameters  and  molecular  subtypes  [5].  To  expand  on  these  results, 
the  goal  of  this  project  is  to  identify,  define  and  formally  test  critical  pathways  mediating  tumor  epithelial- 
stromal  communication  and  co-dependency  in  TN  breast  cancer.  Our  specific  aims  are  to  develop 
coordinate  stromal-epithelial  expression  signatures,  to  identify  stromal-epithelial  gene  interaction 
networks,  and  to  identify  and  integrate  stromal-epithelial  gene  expression  and  miR  signatures  associated 
with  TN  breast  tumors.  To  date,  we  have  profiled  the  gene  and  miR  expression  of  distinct  epithelial  and 
stromal compartments from ca. 50 TN tumors, identified epithelial and stromal gene expression modules,
and  generated  an  interaction/correlation  map  of  these  gene  modules.  These  are  critical  steps  towards 
accomplishing  our  major  objectives  and  understanding  the  relationship  between  the  tumor  and  its 
microenvironment.  Future  experiments  will  include  identifying  the  biological  functions  represented  by 
our  gene  modules,  characterizing  the  epithelial-stromal  interactions  associated  with  good  or  poor 
response  to  chemotherapy,  and  validating  gene  and  associated  miR  signatures  as  predictors  of  outcome 
using  patient  samples.  Although  other  groups  have  subtyped  TN  breast  cancer  (e.g.  [33],  [34]),  these 
molecular  studies  have  been  performed  on  whole  tumor  samples  with  >70%  epithelial  content.  Our 
project  will  provide  the  first  integrated  in-depth  analysis  of  the  contribution  of  tumor  stromal  processes 
to  disease  heterogeneity,  and  will  position  the  tumor  microenvironment  for  therapeutic  intervention. 

6.  Publications,  Abstracts  and  Presentations 

6.1  Poster  presentation 

C.  Thompson,  N.  Bertos,  T.  Gruosso,  G.  Finak,  R.  Lesurf,  S.  Saleh,  H.  Zhao,  M.  Souleimanova,  S. 
Meterissian,  A.  Omeroglu,  M.T.  Hallett  &  M.  Park.  A  new  breast  cancer  classification  scheme  based  on 
novel classes of tumor stroma. San Antonio Breast Cancer Symposium® annual meeting, San Antonio TX,
December  2014. 

7.  Inventions,  Patents  and  Licenses 

Nothing  to  report. 

8.  Reportable  Outcomes 

During  this  reporting  period,  we  generated  an  interaction  network  describing  the  relationships 
between  epithelial  and  stromal  gene  expression  modules  in  TN  tumors.  This  research  tool  is  an  essential 
achievement  in  our  strategy  to  identify,  define  and  formally  test  critical  pathways  mediating  tumor 
epithelial-stromal  communication  and  co-dependency  and,  ultimately,  understand  the  impact  of 
epithelial-stromal  interactions  on  triple  negative  breast  tumor  heterogeneity. 

9. Other Achievements

9.1  Training  and  Professional  Development 

As  the  Principal  Investigator  on  this  project,  I  have  had  the  opportunity  to  train  in  new  techniques 
and  improve  my  professional  skills  over  the  past  year.  With  the  assistance  of  collaborators  with  expertise 
in  bioinformatics  (e.g.  S.  Saleh),  I  have  learned  about  class  discovery,  WGCNA  and  gene  set  enrichment 
analysis.  In  addition,  I  have  met  routinely  with  Dr.  Bertos  and  my  mentor,  Dr.  Park,  to  discuss  technical 
and  theoretical  aspects  of  the  project  as  well  as  budgetary  concerns.  I  have  learned  how  to  use  the 
financial  systems  in  place  at  McGill  University  to  monitor  and  control  my  research  funds.  These 
meetings/training  have  contributed  to  my  abilities  in  project  management. 

My  project  location,  the  Goodman  Cancer  Research  Centre  at  McGill  University,  runs  a  weekly 
seminar  series  at  which  Principal  Investigators,  graduate  students  and  postdoctoral  fellows  present  their 
work.  In  addition,  invited  external  speakers  present  their  current  research  at  regular  seminars.  Many  of 
these  researchers  are  working  on  breast  cancer  projects  and  these  seminars  are  keeping  me  abreast  of 
current  trends  in  the  field.  They  also  provide  opportunities  for  collaborations  or  additional  training. 

This year, I trained and mentored a doctoral candidate in her research project. This is an
important  piece  of  my  training  because  as  a  professor  with  my  own  lab,  I  will  be  training  and 
directing/mentoring  undergraduate  and  graduate  students. 

Finally,  because  I  am  interested  in  connecting  basic  research  with  the  clinic,  I  attended  the  San 
Antonio  Breast  Cancer  Symposium®  annual  meeting  in  December  2014.  The  Symposium's  mission  is  to 
provide  state-of-the-art  information  on  breast  cancer  research.  The  five-day  program  is  attended  by  a 
broad  international  audience  of  academic  and  private  researchers  as  well  as  physicians  from  over  90 
countries  and  aims  to  achieve  a  balance  of  clinical,  translational,  and  basic  research.  In  addition  to 
attending  presentations  covering  a  range  of  topics  such  as  patient-derived  xenografts  as  models  of 
metastasis,  the  reliance  of  HER2  pathology  on  HER3  and  the  most  recent  advances  in  immunotherapy,  I 
attended  a  career  development  forum  for  young  investigators  and  presented  a  poster  entitled  "A  new 
breast  cancer  classification  scheme  based  on  novel  classes  of  tumor  stroma."  There  was  a  great  deal  of 
interest  in  the  poster  presentation  and  I  was  able  to  interact  with  students,  post-doctoral  fellows,  Principal 
Investigators,  clinicians  and  breast  cancer  survivors.  It  was  a  great  opportunity  to  discuss  the  project, 
highlight  the  progression  of  the  research  and  brain-storm  future  directions  and  applications  of  our  results. 
Overall  the  experience  was  very  inspiring  and  expanded  my  vision  as  a  breast  cancer  researcher. 

11. Appendices


11.1  Statement  of  Work 


Statement  of  Work 

Note: All work will be performed at the Goodman Cancer Research Institute, 1160 Des Pins Avenue West,
Montreal Quebec, Canada, H3A 1A3 unless specified. The Principal Investigator (PI) is Dr. Crista Thompson
and the Mentor is Dr. Morag Park.


Task  Description  |  Year  1  |  Year  2  |  Year  3 

1. Develop coordinate stromal-epithelial mRNA expression signatures for Triple negative (TN) tumors*

*  Resource:  Dr.  Park  established  the  Breast  Cancer  Functional  Genomics  Group.  This  group  has  banked 
fresh-frozen  breast  cancer  tumor  (approx.  400)  and  normal  (approx.  500  including  matched  samples  and 
reduction  mammoplasties)  tissue  samples  obtained  from  surgeries  conducted  at  the  McGill  University 
Health  Centre  under  strict  quality  control  guidelines.  Blood  samples  collected  at  the  time  of  surgery  have 
been processed as serum and plasma and stored. Matched formalin-fixed paraffin-embedded (FFPE)
samples from the clinical pathology archive can be obtained when feasible and tissue microarrays for
banked samples have been constructed to aid large-scale IHC and in situ hybridization analyses. An
attending clinical pathologist specializing in breast pathology rescores all banked samples for
consistency. HER2 fluorescence in situ hybridization is performed to confirm HER2 status in equivocal
cases  and  p53  mutation  analysis  is  conducted  for  all  samples.  All  experimental  data  is  linked  to 
information  regarding  pathology  analysis,  therapy  and  disease  course.  Tissue  and  blood  collection  and 
participant  follow-up  providing  outcome  is  conducted  with  Research  Ethics  Board  approval. 

1a. Conduct laser capture microdissection (LCM) to isolate separate epithelial
and  stromal  compartments  from  banked  tumor  samples,  both  tumor-associated 
and  adjacent  normal  tissues. 

*  Collaborator/Personnel:  Dr.  Nicholas  Bertos  /  Hong  Zhao 

*  Samples from 30 TN patients with distant recurrence within 5 years
and 20 TN patients with no recurrence in 5 years will be analyzed;
therefore, there will be a total of 200 analyses (50 samples x 4 tissue
compartments/sample).

*  PI Training: Learn how to perform LCM.

Months 

1-8 

1b. Extract RNA from epithelial and stromal LCM isolates and subject to
microarray-based gene expression profiling.

*  Collaborator/Personnel: Dr. Nicholas Bertos / Hong Zhao

*  Profiling will be performed with Agilent Whole Human Genome
4x44K chips

*  PI Training: Learn how to extract RNA from LCM isolates.

*  PI Training: Learn how to perform microarray-based gene profiling.

Months 

6-12 

1c.  Identify  stromal  subclasses. 

*  Collaborator/Personnel: Dr. Michael Hallett / Sadiq Saleh

*  Methods: Genes defining stromal subclasses will demonstrate
homogeneous expression within the corresponding cluster, as well as
heterogeneous expression outside the cluster as determined by variance
component analysis. The biological functions over-represented in each
stroma class will be identified by performing gene set enrichment
analysis and testing for enrichment against multiple ontological
databases including Gene Ontology (GO), the Kyoto Encyclopedia of
Genes and Genomes (KEGG) and List2List (L2L).

*  PI Training: Learn about class discovery and gene set enrichment
analysis.

Months 

1-6 

Milestone:  Complete  characterization  of  profiles  in  matched  normal  and  tumor  stroma  and  corresponding 
epithelia  to  reveal  relevant  tumor-associated  changes  and  epithelial-stromal  gene  expression  networks. 

Task Description | Year 1 | Year 2 | Year 3

2. Identify stromal-epithelial gene interaction networks.

2a. Develop a de novo bioinformatics tool, STR-EPI, to identify genes
modulating cross-talk between tumor epithelium and tumor-associated
stromal components.

*  Collaborator/Personnel: Dr. Michael Hallett / Sadiq Saleh

*  Resources: A comprehensive database of >1600 breast cancer-specific
gene signatures (BreastSigDB). These include both signatures from the
literature as well as those contained in public databases such as
MSigDB.

*  Methods:  We  will  develop  a  stromal-epithelial  interaction  map  for  each 
prominent  subtype  combination  identified  in  task  1  using  a  variety  of 
established  and  new  informatics  tools. 

Months 

6-12 

Months 

1-3 

Milestone:  Development  of  a  new  bioinformatics  tool  STR-EPI  to  identify  stromal-epithelial  gene  signatures. 

2b. Characterize epithelial-stromal subtypes specifically associated with good
or poor response to chemotherapy.

*  Collaborator/Personnel: Dr. Michael Hallett / Sadiq Saleh

*  Resource: We have generated a human gene expression data
compendium derived from 22 publicly available datasets that contained
patients diagnosed with invasive ductal carcinoma with associated
clinical information, including recurrence status (defined as distant
metastasis within 5 years), survival, and immunohistochemistry results
(currently n = 5175 patients, including 619 TN patients).

*  Methods: Within the stromal and epithelial datasets, each gene present
will be ranked as a univariate predictor of recurrence using a
parametric test. These predictors will be trained using a Naive Bayes
Classifier and cross-validated under a leave-one-out cross-validation
scheme. The signature will be re-trained in our data and validated using
the same procedure in new and existing gene expression datasets with
outcome following treatment with anthracycline- and/or taxane-based
regimens, utilizing our breast cancer gene expression compendium
mentioned above.

Months 

3-6 

2c.  Validate  STR-EPI  outcome  predictors. 

*  Collaborator/Personnel: Dr. Nicholas Bertos / Hong Zhao

*  Methods: Outcome predictors will be validated by reverse transcriptase
PCR and IHC/in situ hybridization using available matched frozen
and/or archival FFPE tissue.

*  Methods:  Results  will  also  be  validated  with  a  tissue  microarray  (TMA) 
composed of samples from ~500 patients treated at the McGill
University Health Centre with 5-year follow-up information.

Months 

7-12 

Milestone:  Identification  and  validation  of  candidate  genes,  pathways  and  interaction  pairs  with  prognostic 
and/or  interventional  applicability. 
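
The leave-one-out scheme named in Task 2b can be illustrated with a minimal Gaussian Naive Bayes classifier. This is a sketch under stated assumptions, not the STR-EPI tool: the Gaussian likelihood, the variance floor, the function names and the two-feature toy layout are all hypothetical, and the univariate gene-ranking step is omitted. In a full analysis, gene ranking would need to be repeated inside each fold so that the held-out sample never influences feature selection.

```python
from math import log, pi
from statistics import mean, pvariance


def fit_gaussian_nb(rows, labels, var_floor=1e-6):
    """Estimate class priors and per-feature Gaussian parameters."""
    model = {}
    for cls in set(labels):
        members = [r for r, lab in zip(rows, labels) if lab == cls]
        columns = list(zip(*members))
        model[cls] = {
            "prior": len(members) / len(rows),
            "means": [mean(c) for c in columns],
            # Floor the variance so a single-member class does not divide by zero.
            "vars": [max(pvariance(c), var_floor) for c in columns],
        }
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one sample."""
    def log_posterior(params):
        score = log(params["prior"])
        for x, m, v in zip(row, params["means"], params["vars"]):
            score += -0.5 * log(2 * pi * v) - (x - m) ** 2 / (2 * v)
        return score
    return max(model, key=lambda cls: log_posterior(model[cls]))


def leave_one_out_accuracy(rows, labels):
    """Train on all samples but one, test on the held-out sample, and repeat."""
    correct = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        model = fit_gaussian_nb(train_rows, train_labels)
        correct += predict(model, rows[i]) == labels[i]
    return correct / len(rows)
```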

Task Description | Year 1 | Year 2 | Year 3

3. Identify and integrate stromal-epithelial miRNA (miR) signatures associated with TN breast tumors.

3a. Profile the miR expression in tumor and normal epithelium and stroma.

*  Collaborator/Personnel: Dr. Nicholas Bertos / Hong Zhao

*  Methods: miR will be isolated from our LCM samples specified in
Task 1. The concentration will be assessed and quality control
performed by NanoDrop spectrophotometer and Bioanalyzer analyses.
The miR expression will be profiled using the NanoString platform
available  at  the  Innovation  Centre  (McGill  University). 
Reproducibility  will  be  assessed  by  quantile  normalization  of  biological 
replicates  and  the  mean  normalized  signal  from  biological  replicates 
will  be  used  for  comparative  expression  analysis. 

*  PI  Training:  Learn  how  to  extract  miR  from  LCM  isolates. 

Months 

6-12 

Milestone: Collection of miR expression profiles in tumor and normal epithelium and stroma.

3b.  Investigate  miR  signatures  for  their  prognostic  value  by  using  linked 
patient  outcome  data. 

*  Collaborator/Personnel: Dr. Michael Hallett / Sadiq Saleh

*  Methods: Differentially expressed miR between normal and tumor
tissues (epithelium- or stroma-derived) will be identified using one-way
analysis of variance (ANOVA, p<0.5) and hierarchical clustering with
Pearson correlation using the top 50 most variably expressed miR.
Differentially expressed miR between stromal or epithelial samples will
be identified at a threshold of P < 1 x 10^-5, using the LIMMA package
in Bioconductor. The miR signatures will be evaluated for their
prognostic  value  using  linked  patient  outcome  data. 

*  PI  Training:  Learn  how  to  link  miR  signatures  to  patient  outcome. 

Months 

6-12 

3c.  Validate  miR  of  interest. 

*  Collaborator/Personnel:  Dr.  Nicholas  Bertos  /  Hong  Zhao 

*  Methods:  miR  of  interest  will  be  validated  via  in  situ  hybridization  on 
FFPE  sections  specified  in  Task  1. 

*  Methods:  PCR-based  assays  for  any  miR  that  correspond  with  tumor 
subtypes  we  previously  identified  will  be  established  such  that  the  miR 
can  be  used  as  biomarkers  in  TN  breast  cancer  patients. 

*  PI  Training:  Learn  how  to  quantify  miR  using  PCR-based  tests  or  in 
situ  hybridization. 

Months 

M2 

Milestone:  Identification  and  validation  of  miR  signatures  with  prognostic  value. 
11.2 Figures


Nature Reviews | Genetics

Figure  1:  Laser  capture  microdissection  (LCM)  is  a  technology  for  rapid  and  easy  procurement  of 
a  microscopic  and  pure  cellular  subpopulation  away  from  its  complex  tissue  milieu,  under  direct 
microscopic  visualization.  The  starting  material  can  be  frozen,  or  fixed,  and  stained.  A  thin  polymer  film  is 
placed  in  direct  contact  with  a  frozen  or  fixed  tissue  section  and  a  laser  beam  activates  the  polymer  and 
so  transfers  the  selected  cell(s)  out  of  the  tissue  and  onto  the  polymer  film.  This  positive  selection  method 
is  done  repeatedly  until  all  of  the  desired  tissue  is  embedded  onto  the  polymer  film.  An  extraction  buffer 
is  applied  to  the  polymer  film  so  that  DNA,  RNA  or  proteins  can  be  solubilized  from  the  captured  tissue 
cells.  LCM  fully  preserves  the  state  of  the  cell's  molecules  for  quantitative  analysis.  Adapted  from  [35]. 
Figure 2: LCM successfully isolates distinct compartments of the tumor. Separation of the most
variable  genes  (interquartile  range,  IQR  >  2)  unbiasedly  into  two  opposing  directions  using  the  Partitioning 
Around  Medoids  (pam)  function  and  subsequent  ranksum  ordering  of  gene  expression  profiles 
distinguishes  epithelial  from  stromal  tissue  (A),  and  normal  from  tumor  tissue  (B,  C).  Tissue  types:  Red 
(tumor  epithelium),  Pink  (normal  epithelium),  Dark  blue  (tumor  stroma),  Light  blue  (normal  stroma).  Rows 
represent  transcripts  and  columns  represent  patient  samples.  Values  are  centered  and  scaled  per 
transcript  across  all  samples  and  represented  by  the  color  key. 
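The filtering and scaling steps named in the Figure 2 caption can be illustrated as follows. This is a sketch under stated assumptions: the expression values are invented, "centered and scaled" is taken to mean a per-transcript z-score, and the inclusive quartile method is an arbitrary choice. The PAM partitioning and ranksum ordering steps are not shown.

```python
from statistics import mean, pstdev, quantiles


def iqr(values):
    """Interquartile range (Q3 - Q1) of one transcript across samples."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def center_and_scale(values):
    """Center to mean 0 and scale to unit standard deviation (z-score)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


# Hypothetical log2 expression values: transcript -> one value per sample.
expr = {
    "gene_1": [2.0, 2.5, 8.0, 9.0, 2.2, 8.5],
    "gene_2": [5.0, 5.1, 5.2, 4.9, 5.0, 5.1],
    "gene_3": [10.0, 9.5, 3.0, 2.5, 9.8, 3.2],
}
# Keep only the most variable transcripts (IQR > 2), then z-score each one.
variable = {name: v for name, v in expr.items() if iqr(v) > 2}
scaled = {name: center_and_scale(v) for name, v in variable.items()}
```

Only `gene_1` and `gene_3` pass the IQR filter here; the nearly constant `gene_2` is dropped before scaling.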
Figure 3: Examples of gene modules for epithelial and stromal components of TN tumors.

Weighted  Gene  Correlation  Network  Analysis  (WGCNA)  [7]  identified  co-modulated  groups  of  genes  that 
had  high  absolute  correlation  in  patient  epithelial  (epi)  and  stromal  (str)  tissues.  The  gene  modules  were 
given  color  names  (e.g.  turquoise  and  red)  to  prevent  overt  assumptions  about  the  biology  of  the  module, 
and  they  were  associated  with  disease  recurrence  in  the  patients.  Rows  represent  genes  and  columns 
represent  patient  samples.  Values  are  centered  and  scaled  per  transcript  across  all  samples  and 
represented  by  the  color  key.  Patients  are  ordered  by  the  ranksum  of  module  gene  expression.  Five-year 
recurrence  is  scored  as  red  (disease  recurrence),  white  (no  recurrence)  and  grey  (no  recurrence,  but  less 
than  3  years  of  follow  up). 
Figure 4: Intra-tissue correlation of epithelial and stromal gene modules from TN tumors.

WGCNA identified 25 epithelial and 24 stromal gene modules. The gene modules are named as colors to
prevent overt assumptions about the biology of the module. The figure shows correlation heatmaps of
all the genes in the epithelial and stromal tissues. Genes are ordered by their modules (represented by
different colors on the axes) and separated into mid-size (less than 250 genes) and large (greater than 250
genes) modules for clarity only.
Figure 5: Inter-tissue correlation of epithelial and stromal gene modules from TN tumors.

Heatmaps  of  the  correlation  of  the  gene  modules  (named  as  colors)  between  the  two  tissue  types  as 
determined  by  Spearman  correlation  permutation  (A)  and  hypergeometric  Fisher's  exact  test  (B). 
Figure 6: Correlation network of the epithelial and stromal gene modules. Map depicting the
correlations  between  epithelial  and  stromal  gene  modules.  The  modules  are  colored  according  to  their 
name.  Two  main  networks  with  5  hubs  (A,  B)  as  well  as  four  independent  epithelial-stromal  interactions 
(C,  D,  E,  F)  are  shown. 
Figure 7: Epi Turquoise contains genes involved in oxidative phosphorylation. Gene set
enrichment  analysis  of  Epi  Turquoise  using  QIAGEN's  Ingenuity®  Pathway  Analysis  (IPA®,  QIAGEN 
Redwood  City,  www.qiagen.com/ingenuity)  indicates  that  one  of  the  canonical  pathways  represented  by 
the  gene  list  is  Oxidative  Phosphorylation.  Genes  from  Epi  Turquoise  implicated  in  the  pathway  are 
outlined  in  pink. 




11.3  Tables 


Table 1: Correlation and common genes between epithelial and stromal gene modules. The
correlation between epithelial (epi) and stromal (str) modules as determined by hypergeometric Fisher's
exact test (p value < 10^-10, + positive correlation, - negative correlation) and Spearman correlation
permutation (absolute correlation > 0.6, + positive correlation, - negative correlation). The total number
of common genes between epithelial and stromal modules is shown, as well as the % of total genes in
common. We defined distinct but correlative modules as having <20% common genes (highlighted in
yellow).


Epi | Hypergeometric Str | +/- | Spearman Str | Correlation | No. common genes | Total Epi genes | Total Str genes | % Epi | % Str
[...] | [...] | [...] | [...] | [...] | 44 | 93 | [...] | 47.3 | [...]
[...] | [...] | [...] | [...] | [...] | 49 | [...] | [...] | 7.7 | 21.4
[...] | [...] | [...] | [...] | [...] | 84 | [...] | [...] | 13.2 | 9.9
green-yellow | [...] | [...] | [...] | [...] | 65 | 337 | [...] | 19.3 | [...]
grey60 | [...] | + | | | 23 | 139 | 48 | 16.5 | 47.9
light green | grey60 | + | grey60 | 0.81 | 33 | 136 | 110 | 24.3 | 30.0
light green | | | light green | 0.6 | 1 | 136 | [...] | 0.7 | 1.0
light yellow | light yellow | + | light yellow | 0.73 | 26 | 128 | 69 | 20.3 | 37.7
light yellow | black | + | | | 34 | [...] | [...] | 26.6 | 4.5
[...] | yellow | + | | | 62 | [...] | [...] | 15.3 | 6.5
midnight blue | cyan | - | | | 19 | 197 | 160 | 9.6 | 11.9
orange | red | + | | | 21 | 53 | 787 | 39.6 | 2.7
pink | green | + | | | 67 | 446 | 852 | 15.0 | 7.9
pink | yellow | + | | | 57 | 446 | 956 | 12.8 | 6.0
purple | green | + | | | 42 | 350 | 852 | 12.0 | 4.9
red | pink | + | | | 172 | 613 | 614 | 28.1 | 28.0
royal blue | dark green | + | | | 26 | 98 | 54 | 26.5 | 48.1
salmon | [...] | + | [...] | 0.64 | [...] | [...] | 752 | [...] | 18.2
[...] | [...] | [...] | [...] | -0.63 | [...] | [...] | [...] | 6.7 | 15.1
[...] | [...] | + | [...] | 0.75 | [...] | [...] | [...] | 29.0 | 34.9
[...] | [...] | [...] | [...] | -0.66 | 26 | 1777 | 66 | 1.5 | 39.4
yellow | [...] | + | [...] | [...] | [...] | [...] | [...] | 35.6 | 31.1
yellow | [...] | [...] | [...] | [...] | 73 | [...] | 1475 | 10.6 | [...]
[...] | [...] | [...] | [...] | 0.6 | 2 | 689 | 66 | 0.3 | 3.0
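The two overlap measures reported in Table 1 (percentage of common genes and a hypergeometric test of the overlap) can be illustrated with the Epi light green / Str grey60 row. This is a sketch, not the analysis code used for the table: the size of the gene universe is not stated in the text, so the value 10,000 below is an assumption, and the function names are hypothetical.

```python
from math import comb


def overlap_pvalue(universe, size_a, size_b, common):
    """Upper-tail hypergeometric p-value, P(overlap >= common).

    Probability of seeing at least `common` shared genes when a module of
    size_b genes is drawn at random from `universe` genes, size_a of which
    belong to the other module (one-sided Fisher's exact test).
    """
    total = comb(universe, size_b)
    upper = min(size_a, size_b)
    hits = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(common, upper + 1)
    )
    return hits / total


def percent_common(common, module_size):
    """Share of a module's genes that are shared, in percent."""
    return round(100 * common / module_size, 1)


# Sizes from the Epi light green / Str grey60 row of Table 1.
epi_size, str_size, shared = 136, 110, 33
# ASSUMPTION: the size of the gene universe is not given in the text.
universe = 10_000

pct_epi = percent_common(shared, epi_size)  # share of the Epi module
pct_str = percent_common(shared, str_size)  # share of the Str module
p_value = overlap_pvalue(universe, epi_size, str_size, shared)
```

With these inputs the percentages reproduce the 24.3 and 30.0 listed in the table. The p-value depends on the assumed universe size, so only its order of magnitude is informative here.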
Table 2: Correlation among epithelial gene modules. The correlation among epithelial modules
as determined by Spearman correlation permutation (absolute correlation > 0.7, + positive correlation, -
negative correlation).


Epi 1 | Epi 2 | Correlation
dark grey | light cyan | 0.76
dark grey | green | -0.82
green | dark green | -0.85
light cyan | green | -0.73
midnight blue | dark green | -0.82
pink | purple | 0.7
salmon | light yellow | 0.72
turquoise | red | -0.89
yellow | turquoise | -0.81
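A Spearman correlation with a permutation-based p-value, as named in the captions of Tables 2 and 3, can be sketched as below. The per-patient module scores are invented, and summarizing each module by a single score per patient is an assumption made for illustration; the text does not say how modules were summarized. Function names are hypothetical.

```python
import random


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the Spearman correlation of x and y."""
    rng = random.Random(seed)
    observed = abs(spearman(x, y))
    shuffled = list(y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the pairing between x and y
        if abs(spearman(x, shuffled)) >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)


# Hypothetical per-patient summary scores for two modules (8 patients).
module_a = [0.2, 0.5, 0.9, 1.4, 1.8, 2.1, 2.6, 3.0]
module_b = [2.9, 2.7, 2.2, 2.0, 1.1, 1.3, 0.6, 0.1]
rho = spearman(module_a, module_b)
p_value = permutation_pvalue(module_a, module_b)
```

The two toy modules are strongly anti-correlated, the kind of pair that would be listed with a negative correlation in the tables.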
Table 3: Correlation among stromal gene modules. The correlation among stromal modules as
determined by Spearman correlation permutation (absolute correlation > 0.7, + positive correlation, -
negative correlation).


Str 1 | Str 2 | Correlation
black | yellow | 0.74
blue | royal blue | -0.75
blue | red | -0.75
blue | pink | -0.84
cyan | turquoise | 0.83
cyan | dark turquoise | 0.71
green | tan | -0.95
green | [...] | -0.83
green | [...] | [...]
grey60 | light green | 0.75
magenta | [...] | 0.76
[...] | [...] | 0.87
[...] | [...] | 0.72
[...] | [...] | 0.7